AD-A144 711  KOLMOGOROV-SMIRNOV GOODNESS-OF-FIT TEST: CORRECTED FOR
USE WITH 'EEG-LIKE' DATA (U)  NAVAL BIODYNAMICS LAB  NEW
ORLEANS LA  M S WEISS  APR 84  NBDL-84R003
UNCLASSIFIED  F/G 12/1




NBDL-84R003


KOLMOGOROV-SMIRNOV GOODNESS-OF-FIT TEST:
CORRECTED FOR USE WITH "EEG-LIKE" DATA

M. S. Weiss


April 1984




NAVAL BIODYNAMICS LABORATORY
New  Orleans,  Louisiana 


Approved  for  public  release.  Distribution  unlimited. 



UNCLASSIFIED
SECURITY CLASSIFICATION OF THIS PAGE (When Data Entered)

REPORT DOCUMENTATION PAGE

1.  REPORT NUMBER: NBDL-84R003

4.  TITLE (and Subtitle): Kolmogorov-Smirnov Goodness-of-Fit Test:
    Corrected for Use with "EEG-Like" Data

5.  TYPE OF REPORT & PERIOD COVERED: Research Report

6.  PERFORMING ORG. REPORT NUMBER: NBDL-84R003

7.  AUTHOR(s): M. S. Weiss

9.  PERFORMING ORGANIZATION NAME AND ADDRESS:
    Naval Biodynamics Laboratory
    Box 29407
    New Orleans, LA 70189

10. PROGRAM ELEMENT, PROJECT, TASK AREA & WORK UNIT NUMBERS: M0097PN001-5004

11. CONTROLLING OFFICE NAME AND ADDRESS:
    Naval Medical Research & Development Command
    Bethesda, MD 20814

12. REPORT DATE: April 1984

19. KEY WORDS (Continue on reverse side if necessary and identify by block number)

    Kolmogorov-Smirnov, goodness-of-fit, EEG, simulation, autoregressive

20. ABSTRACT (Continue on reverse side if necessary and identify by block number)

The one-sample Kolmogorov-Smirnov goodness-of-fit test (KS) is designed for
use with independent data and can be highly sensitive to correlated data.
Standard critical values for KS cannot be used with data with known
correlations. For data with EEG-like spectra (low frequency, high amplitude
spectral peaks) an empirically derived correction for KS provides correct
critical values and retains much of the power of the original KS. The
correction is based on a simple quadratic expression involving a parameter, τ,
computed from zero-crossing measurements.




NBDL-84R003


KOLMOGOROV-SMIRNOV GOODNESS-OF-FIT TEST:
CORRECTED FOR USE WITH "EEG-LIKE" DATA

M. S. Weiss


April  1984 


Naval  Medical  Research  and  Development  Command 
Research  Work  Unit  No.  M0097PN001-5004 


Approved by                               Released by

J. C. Guignard                            Captain L. E. Williams, MC, USN
Chairman, Editorial Review Board          Commanding Officer


Naval  Biodynamics  Laboratory 
New  Orleans,  Louisiana 


Opinions  or  conclusions  contained  in  this  report  are  those  of  the  author  and 
do  not  necessarily  reflect  the  views  or  the  endorsement  of  the  Department  of 
the  Navy.  Approved  for  public  release;  distribution  unlimited.  Reproduction 
in  whole  or  in  part  is  permitted  for  any  purpose  of  the  United  States 
Government. 


SUMMARY 


THE  PROBLEM 

The standard Kolmogorov-Smirnov statistical test for goodness-of-fit (KS)
requires that the data being tested come from independent samples. The use of
this test with highly correlated time series data such as EEG data is
inappropriate and yields erroneous results. A procedure is required to
correct KS for the influence of correlation among the sampled data without
sacrificing statistical power.


FINDINGS 

For data with EEG-like spectra (low frequency, high amplitude spectral
peaks) an empirically derived correction for KS provides correct critical
values and retains much of the power of the original KS. The correction is
based on a simple quadratic expression involving a parameter, τ, computed from
zero crossing measurements.


Kolmogorov-Smirnov Goodness-of-Fit Test:
Corrected for Use with "EEG-Like" Data


I.  INTRODUCTION 


Traditional  methods  of  computerized  time-series  analysis  (e.g.,  spectral 
analysis,  period  analysis)  as  well  as  more  advanced  multi-variate  statistical 
analyses  rely  on  assumptions,  techniques  and  methodology  developed  for 
situations  where  the  underlying  statistics  of  the  data  to  be  analyzed  are 
reasonably  well  understood.  This  is  not  true  for  electroencephalographic 
(EEG)  time-series  data  where  uncritical  adoption  of  traditional  approaches 
to  the  problem  of  EEG  analysis  may  be  inappropriate.  The  use  of  computer 
simulated  EEG  data  provides  a  means  of  attacking  this  problem  by 
quantitatively  evaluating  and  comparing  analytic  procedures.  It  can  also  lead 
to  the  development  of  new  procedures  which  are  valid  for  time  series  data  with 
properties  similar  to  those  of  the  EEG  (EEG-like  data). 

The results reported here describe a modification of the Kolmogorov-Smirnov
statistical  test  (KS)  for  normally  distributed  time  series  data  when  these 
data  are  highly  correlated  with  spectral  characteristics  similar  to  those  of 
EEG  data.  A  general  purpose  mini-computer  (DGC  Eclipse®  S/140)  was  used  to 
generate  and  analyze  simulated  EEG-like  time  series  using  specially  designed 
software  (16,  17). 

The KS for goodness-of-fit between a theoretical distribution and a sample
set of observations is well known and widely used. It is defined as:

$$KS = \max_x \left| F(x) - S_n(x) \right| \tag{1.1}$$

where $F(x)$ is a population distribution function and $S_n(x)$ is the sample
distribution step-function. For continuous $F(x)$ the sampling distribution of
KS is known and is independent of $F(x)$. For the particular case of $F(x)$
normal with mean and variance estimated from the sample, Monte Carlo
corrections for critical values of KS have been obtained (6, 12, 14). In the
past decade there has been some attention devoted to applying KS to sampled
EEG data (e.g., 7, 8, 13). These data, like most time-series data, can be highly
correlated. Correct application of the KS (or any other statistic that
demands independent data) requires that the effective sampling rate be reduced
by discarding data in order to ensure an uncorrelated ("white noise") sample
(7). In practice, the original sample length is often of fixed duration, so
this procedure results in a decreased sample size and a consequent reduction
in the power of the test. Ideally, an estimate of the correlation properties
of the data could be used to correct the test statistic, eliminating the need
for discarding data.
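
As an illustration of definition (1.1), the following is a minimal Python sketch of the KS statistic for a normal F(x) with mean and variance estimated from the sample. It is not the software used in this report; the function names `normal_cdf` and `ks_statistic_normal` are hypothetical, and NumPy is assumed to be available.

```python
import math
import numpy as np

def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def ks_statistic_normal(x):
    """KS = max |F(x) - S_n(x)|, with F normal and its mean and
    variance estimated from the sample (assumption of this sketch)."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    z = (x - x.mean()) / x.std(ddof=1)
    f = np.array([normal_cdf(v) for v in z])
    # S_n is a step function, so the largest gap can occur on either
    # side of a step; both sides are checked.
    above = np.arange(1, n + 1) / n - f   # S_n just after each jump
    below = f - np.arange(0, n) / n       # S_n just before each jump
    return float(max(above.max(), below.max()))

rng = np.random.default_rng(0)
print(ks_statistic_normal(rng.standard_normal(256)))
```

The value returned would be compared against critical values appropriate for estimated parameters (6, 12, 14), not against the standard KS table.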

A  general  procedure  using  this  approach  has  been  developed  and  used  to 
construct  a  corrected  KS.  This  correction  enables  the  use  of  all  the  data  as 
originally  sampled,  increasing  the  power  of  the  test.  The  results  are 
applicable  to  EEG-like  time  series  data  and  are  an  extension  of  work 
previously  reported  (14,  15). 




II.  METHOD 


The simulated EEG time series were generated by linearly combining
second-order autoregressive (AR) series. Though the autoregressive model has
been widely used and extensively developed (1, 3), some pertinent ideas and
results are outlined here. A time series, which is a set of sequential
observations of some (stochastic) process, is said to be autoregressive of
order p if each observation, $x_t$, can be expressed as:

$$x_t = a_1 x_{t-t_s} + a_2 x_{t-2t_s} + \cdots + a_p x_{t-pt_s} + a_{p+1} E_t \tag{2.1}$$

where each $E_t$ is an independent sample from a zero mean, unit variance, random
process with an arbitrary distribution function and $t_s$ is the time between the
successive observations $x_{t-t_s}$ and $x_t$. The second-order AR process is of
particular theoretical and practical importance and can be expressed as:

$$x_t = a_1 x_{t-t_s} + a_2 x_{t-2t_s} + a_3 E_t \tag{2.2}$$

with the following stability conditions required for the process described by
(2.2) to remain bounded:

$$|a_2| < 1, \qquad |a_1| < 1 - a_2 \tag{2.3}$$

The mean of $x_t$ is zero and:

$$\sigma_x^2 = \frac{a_3^2 (1 - a_2)}{\left\{ (1 - a_2)^2 - a_1^2 \right\} (1 + a_2)} \tag{2.4}$$
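
The stability conditions (2.3) and the variance (2.4) can be checked numerically. The sketch below, with hypothetical function names, compares (2.4) with the variance obtained by summing the squared impulse response of (2.2); the coefficients are the first peak of EEG type 4 in Table 1.

```python
def ar2_is_stable(a1, a2):
    """Stability conditions (2.3)."""
    return abs(a2) < 1.0 and abs(a1) < 1.0 - a2

def ar2_variance(a1, a2, a3):
    """Closed-form variance (2.4)."""
    return a3**2 * (1.0 - a2) / (((1.0 - a2)**2 - a1**2) * (1.0 + a2))

def ar2_variance_from_impulse_response(a1, a2, a3, n_terms=5000):
    """Variance as a3^2 * sum(psi_j^2), where psi_j is the response of
    (2.2) to a unit impulse: psi_0 = 1, psi_1 = a1,
    psi_j = a1*psi_{j-1} + a2*psi_{j-2}."""
    psi_prev, psi = 0.0, 1.0
    total = 1.0
    for _ in range(n_terms):
        psi_prev, psi = psi, a1 * psi + a2 * psi_prev
        total += psi * psi
    return a3**2 * total

a1, a2, a3 = 1.50, -0.6, 0.25
print(ar2_is_stable(a1, a2))
print(ar2_variance(a1, a2, a3))
print(ar2_variance_from_impulse_response(a1, a2, a3))
```

The two variance values should agree closely whenever (2.3) holds; the truncation at `n_terms` is an assumption that is adequate only when the impulse response has decayed.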


The normalized autocorrelation function corresponding to (2.2), for
$a_1^2 + 4a_2 < 0$, is:

$$\rho_k = (-a_2)^{|k|/2} \, \frac{\cos(\omega_0 k - \phi_0)}{\cos \phi_0}, \qquad k = 0, \pm 1, \pm 2, \ldots \tag{2.5}$$

where:

$$\cos \omega_0 = \frac{a_1}{2(-a_2)^{1/2}}, \qquad \tan \phi_0 = \frac{1 + a_2}{1 - a_2} \cot \omega_0$$

The corresponding expression for the one-sided power spectral density
function is:

$$P(f) = \frac{2 t_s a_3^2}{D(f)}$$

where

$$D(f) = 1 + a_1^2 + a_2^2 - 2a_1(1 - a_2)\cos 2\pi f t_s - 2a_2 \cos 4\pi f t_s \tag{2.6}$$
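
Equation (2.6) can be evaluated directly. The following sketch (hypothetical names, NumPy assumed) computes P(f) on a frequency grid up to 1/(2 t_s), confirms that its integral reproduces the variance (2.4), and compares the numerically located peak with the stationary point of D(f), cos(2π f t_s) = a_1(a_2 - 1)/(4 a_2). The coefficients are the second peak of EEG type 4 in Table 1, with t_s = 1 assumed.

```python
import numpy as np

def ar2_spectrum(f, a1, a2, a3, ts=1.0):
    """One-sided power spectral density (2.6) of the AR(2) process (2.2)."""
    w = 2.0 * np.pi * np.asarray(f, dtype=float) * ts
    d = (1.0 + a1**2 + a2**2
         - 2.0 * a1 * (1.0 - a2) * np.cos(w)
         - 2.0 * a2 * np.cos(2.0 * w))
    return 2.0 * ts * a3**2 / d

a1, a2, a3, ts = 1.20, -0.6, 0.53, 1.0
f = np.linspace(0.0, 0.5 / ts, 20001)
p = ar2_spectrum(f, a1, a2, a3, ts)

# Trapezoidal integral of the one-sided spectrum over [0, 1/(2 ts)].
area = float(np.sum(0.5 * (p[1:] + p[:-1]) * np.diff(f)))
variance = a3**2 * (1.0 - a2) / (((1.0 - a2)**2 - a1**2) * (1.0 + a2))

# Peak location: numerical maximum versus the stationary point of D(f).
f_peak = float(f[np.argmax(p)])
f_peak_expected = float(np.arccos(a1 * (a2 - 1.0) / (4.0 * a2)) / (2.0 * np.pi * ts))

print(area, variance)
print(f_peak, f_peak_expected)
```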

The autocorrelation functions described in (2.5) consist of damped sine
waves and exponentials and consequently can be used to describe a wide variety
of natural phenomena, including EEG activity (2, 4, 9). In addition,
examination of (2.6) shows that for $|a_1(1 - a_2)| < |4a_2|$ and $a_2 < 0$, the
spectrum will have a peak at $\omega_0$ if $\cos \omega_0 t_s = a_1(a_2 - 1)/4a_2$.
Thus processes such as EEG activity, which have peaked spectra, can often be
approximately represented by a combination of second-order AR series. The
simplest procedure is to use an independent linear combination of such series
with each series selected to represent one of the desired spectral peaks. By
appropriately selecting each $a_3$ (2.2) the relative amplitude of each peak can
be specified. The resultant time series for up to five independent peaks is
represented by the sum:

$$S_t = \sum_{n=1}^{N} \left( a_{1n} x_{n,t-t_s} + a_{2n} x_{n,t-2t_s} + a_{3n} E_{n,t} \right), \qquad 1 \le N \le 5 \tag{2.7}$$

The corresponding spectrum is simply:

$$F(f) = \sum_{n=1}^{N} P_n(f), \qquad 1 \le N \le 5 \tag{2.8}$$

which is a linear combination of spectra of the form in (2.6), with the
relative weights determined by appropriate selection of $a_{3n}$.
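
A minimal sketch of how a series of the form (2.7) might be generated is given below. It is not the FORTRAN package cited in (17): the function names, the burn-in length, and the use of NumPy's normal generator for E_t are assumptions of this illustration. The example coefficients are those listed for EEG type 4 in Table 1.

```python
import numpy as np

def simulate_ar2(a1, a2, a3, n, rng, burn_in=500):
    """One second-order AR series (2.2) driven by unit-variance normal noise.
    The first burn_in points are discarded so start-up transients decay."""
    e = rng.standard_normal(n + burn_in)
    x = np.zeros(n + burn_in)
    for t in range(2, n + burn_in):
        x[t] = a1 * x[t - 1] + a2 * x[t - 2] + a3 * e[t]
    return x[burn_in:]

def simulate_eeg_like(components, n, seed=0):
    """Sum of independent AR(2) series, one per spectral peak, as in (2.7).
    components is a list of (a1, a2, a3) triples."""
    rng = np.random.default_rng(seed)
    return sum(simulate_ar2(a1, a2, a3, n, rng) for a1, a2, a3 in components)

EEG_TYPE_4 = [(1.50, -0.6, 0.25), (1.20, -0.6, 0.53)]
sample = simulate_eeg_like(EEG_TYPE_4, n=512)
print(sample.shape, sample.var())
```

Because the component series are independent, the variance of the sum is the sum of the component variances given by (2.4).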

The actual procedures involved selecting the location, relative amplitude
and bandwidth ($f_n$, $r_n$, $b_n$) for each peak. The first two parameters in (2.7)
were chosen by computing:

$$a_{2n} = .1 b_n - 1, \qquad a_{1n} = \frac{4 a_{2n} \cos f_n}{a_{2n} - 1} \tag{2.9}$$

A more precise form for the half-power bandwidth is $a_{2n} = .06 b_n - .9$.
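
The selection rule (2.9) can be sketched as follows. The report does not state the units of f_n and b_n; this illustration assumes f_n is an angle in radians per sample (that is, ω_0 t_s) and b_n is in whatever units make .1 b_n - 1 a valid coefficient. The function name is hypothetical.

```python
import math

def ar2_coefficients(f_n, b_n):
    """First two AR(2) coefficients from peak location and bandwidth, (2.9).
    Assumes f_n is given in radians per sample."""
    a2 = 0.1 * b_n - 1.0
    a1 = 4.0 * a2 * math.cos(f_n) / (a2 - 1.0)
    return a1, a2

a1, a2 = ar2_coefficients(f_n=0.5, b_n=4.0)
print(a1, a2)

# By construction the peak condition of (2.6) is satisfied at f_n:
print(a1 * (a2 - 1.0) / (4.0 * a2), math.cos(0.5))
```

Any pair produced this way should still be checked against the stability conditions (2.3) before use.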




The third parameter, $a_{3n}$, was computed so that the total variance of the
resultant time series (2.7) was $r_n$ and the amplitude of (2.7) at each peak
was proportional to $r_n$:

[Equation (2.10), giving $a_{3n}$ in terms of $r_n$, $P_n(f_n)$ and a normalizing
constant $k_n$, is not legible in this copy.]

Equations (2.7) - (2.10) were used to simulate a variety of EEGs using
both normal and non-normal distributions for $E_t$. The details of the basic
Monte Carlo procedure used in these simulations are described elsewhere (14,
16). The parameters for each of these time-series are detailed in Table 1.
These simulated data were then used to compute empirical KS statistics.
Previous results (14, 16) showed that for normally distributed $E_t$ there was no
simple relationship between the empirical KS statistic (KSe) and the
autoregressive parameters in (2.7). However, series with narrow low frequency
peaks in the spectra (2.8) yielded larger KSe values than series with broader
or higher frequency spectral peaks. Two spectral parameters reflected this
effect: $m_1$, the square root of the second moment, and $m_2$, the fourth root of
the fourth moment of (2.8). In particular, the difference $1/m_1 - 1/m_2$ was
monotonically related to KSe. Using this result, a dimensionless parameter $\tau$
was defined as:

$$\tau = (1/m_1 - 1/m_2)/t_s \tag{2.11}$$


For a given set of simulated EEG samples, the population spectral moments and
$\tau$ can be computed directly from (2.8) by integration. In actual practice,
however, this is not possible since the population parameters are unknown and
estimates must be made from sample parameters. An alternative to using sample
spectral estimates for computing $\tau$ lies in the zero crossing approach. If $Z_n$
is the average zero crossing rate of the nth derivative of a time series and
$P(f)$ is the power spectral density of the time series then the following holds
(10):

$$(Z_n/2)^2 = \frac{\int f^{2(n+1)} P(f)\, df}{\int f^{2n} P(f)\, df} \tag{2.12}$$




By definition, the moments of $P(f)$ are:

$$M_n = \frac{\int f^n P(f)\, df}{\int P(f)\, df}, \qquad n = 0, 1, 2, \ldots \tag{2.13}$$

and $M_{2n+1} = 0$ since $P(f)$ is an even function.

Combining (2.12) and (2.13):

$$\frac{M_{2(n+1)}}{M_{2n}} = (Z_n/2)^2; \qquad M_0 = 1, \qquad n = 0, 1, 2, \ldots \tag{2.14}$$

which can be rewritten as:

$$M_{2n} = \prod_{i=0}^{n-1} (Z_i/2)^2, \qquad n = 1, 2, 3, \ldots \tag{2.15}$$

This  final  relationship  (2.15)  expresses  the  moments  of  the  power  spectral 
density  as  a  product  of  the  average  zero  crossing  rates  of  the  derivatives  of 
the  original  time  series.  In  particular,  for  the  second  and  fourth  moments: 


$$M_2 = (Z_0/2)^2, \qquad M_4 = (Z_0 Z_1/4)^2 \tag{2.16}$$

Therefore:

$$m_1 = Z_0/2, \qquad m_2 = (Z_0 Z_1)^{1/2}/2$$

and:

$$\hat{\tau} = \left( 2/Z_0 - 2/(Z_0 Z_1)^{1/2} \right)/t_s \tag{2.17}$$

The general correction for KSe takes the form

$$KS_c(\alpha, n) = KS_e(\alpha, n)/G(\alpha, n, \tau) \tag{2.18}$$

where $n$ is the sample size, $\alpha$ is the significance level, KSe is the measured KS
and KSc is the corrected KS. KSc has the same distribution as the tabulated
KS which is based on independent samples. The general properties of G are
such that G approaches 1.0 for $\tau < 1.0$ and increases monotonically with
increasing $\tau$. Rewriting (2.18) as:

$$G(.05, n, \hat{\tau}) = \frac{KS_e(.05, n)}{KS(.05, n)} \tag{2.19}$$

where $\hat{\tau}$ represents the empirical estimate of $\tau$ using sample zero crossing
rates in (2.17), the problem is to find an appropriate analytic form for G.
The  solution  to  this  problem  is  the  outcome  of  the  results  reported  below. 
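
The zero-crossing estimate of τ in (2.17) might be computed along the following lines. This is a sketch under stated assumptions, not the procedure used for the tabulated values: the derivative is approximated by a first difference, a zero crossing is counted as a sign change between consecutive samples, and the rate is expressed per unit time. With these conventions the values of τ listed in Table 1 are not necessarily reproduced. Function names are hypothetical.

```python
import numpy as np

def zero_crossing_rate(x, ts):
    """Average number of sign changes per unit time."""
    s = np.sign(x)
    s = s[s != 0]                      # ignore samples that are exactly zero
    crossings = np.count_nonzero(s[1:] != s[:-1])
    return crossings / ((len(x) - 1) * ts)

def tau_from_zero_crossings(x, ts):
    """Estimate of tau from (2.17), using Z_0 of the series and Z_1 of its
    first difference (an assumed approximation to the derivative)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    z0 = zero_crossing_rate(x, ts)
    z1 = zero_crossing_rate(np.diff(x), ts)
    return (2.0 / z0 - 2.0 / np.sqrt(z0 * z1)) / ts

rng = np.random.default_rng(0)
print(tau_from_zero_crossings(rng.standard_normal(512), ts=1.0))
```

For a pure sinusoid Z_0 and Z_1 are equal, so the estimate is close to zero. The sketch does not guard against a series with no zero crossings, for which (2.17) is undefined.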


III.  RESULTS 


Ten thousand samples of seven different normally distributed simulated
EEGs (2.7) with n ranging from 32 to 512 and $\tau$ (2.11) ranging from 1.03 to
2.80 were generated. KSe($\alpha$, n) was computed for $\alpha$ = .05. An additional set of
samples was generated for an uncorrelated time-series (EEG type 8). Table 1
lists the simulated EEG parameters and Figure 1 illustrates the spectra and
autocorrelation functions of each of the seven EEG types. A least-mean-square
fit for the following functional forms of G (2.19) was computed, using the
simulated results as input to the BMD program for derivative-free non-linear
regression (BMDPAR):

Polynomial:

$$\sum_{i=1}^{3} \left( P_{1i} + P_{2i}/n + P_{3i}/n^2 \right) \tau^{i-1}$$

Exponential:

$$1 + \left( P_1 + P_2/n^j \right) \tau^k e^{-\tau (P_3 + P_4/n^j)}, \qquad j = 1, 2; \; k = 1, 2$$


The  results  of  this  least-mean-squares  fit  were  tested  using  a  second 
independent  set  of  simulated  EEGs.  The  final  form  chosen  for  G  was: 

$$G = 1.0 + 0.3/n + (.09 - 1.6/n)\hat{\tau}^2 \tag{3.1}$$

which  minimized  residuals  and  had  the  fewest  parameters. 
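
Applying the correction is then a matter of arithmetic. The sketch below evaluates (3.1) and divides the measured statistic by G as in (2.18); the function names are hypothetical. The fit was obtained at the .05 level for n between 32 and 512 and τ between 1.03 and 2.80, so use outside that range is an extrapolation.

```python
def g_correction(n, tau):
    """Correction factor G of (3.1)."""
    return 1.0 + 0.3 / n + (0.09 - 1.6 / n) * tau**2

def corrected_ks(ks_e, n, tau):
    """Corrected statistic KSc = KSe / G, as in (2.18)."""
    return ks_e / g_correction(n, tau)

n, tau = 512, 2.80
print(g_correction(n, tau))
print(corrected_ks(0.10, n, tau))
```

The corrected value is compared with the tabulated critical value for independent samples of the same size.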

This contrasts with previous results using simpler time series which
suggested a linear function of $\tau$ (14). Table 2 summarizes the results of the
second set of simulations which led to the selection of (3.1). This table
indicates, as expected, that the use (or more appropriately, misuse) of the
uncorrected KSe statistic with correlated data leads to a high probability of
falsely rejecting the true hypothesis of an underlying normal distribution
(Type I error). Application of the correction formula and use of the
corrected KSc statistic reduces the probability of Type I error, i.e., the
significance level, to the appropriate range of values.

The correction equations, (3.1) and (2.18), were then applied to a set of
non-normally distributed data. Three unimodal non-normal distributions were
used to generate $E_t$ (2.7) and the resulting distributions for each of the
simulated data sets are illustrated in Figure 2, where the vertical axes
indicate percentages. The bottom row of this figure (EEG type 8) shows the
underlying histograms for the three types of distributions. Distributions A
and B are peaked while C is truncated and skewed. Sample time series
corresponding to each of these distributions are illustrated in Figure 3.




For each type of simulated EEG 10,000 samples were generated, for a total of
21 sample sets. The power of the modified KS statistic, KSc, was measured at
both the .01 and .05 significance levels and is summarized in Table 3.
The uncorrected statistic would, of course, yield an artificially greater
power in all cases due to a larger $\alpha$. The probability of a Type II error
(accepting the false hypothesis of a normal distribution) is 1 - power and
obviously depends on the true underlying distribution. In all cases, power
increases with increasing sample size.


IV.  DISCUSSION 


For correlated data with EEG-like spectra with strong peaks in the lower
portion of the spectrum, KS as a test for normality can be used by applying
equations (2.17) - (3.1). An alternative is to decimate the data by decreasing
the effective sample rate, yielding an uncorrelated series. The problem with
this approach is that there is no a priori way to decide how much data to
discard. For fixed length samples this may result in very small samples with
a corresponding reduction in power. For the present data an attempt to
address this problem resulted in Table 4. Normal samples of simulated EEG
were generated with sample size equal to 512. Sub-samples with every second,
fourth, eighth and sixteenth point selected were then analyzed by computing KSc
and KSe as described above. When KSe approached the correct 1% and 5%
critical values, the time-series could be considered effectively uncorrelated.
Depending on the EEG type this could mean selecting anywhere from every fourth
to every sixteenth point. Since in practical situations sample size is kept
to a minimum to maximize stationarity, this would not be a useful approach.

For example, taking a sample length of 1 second (7) would mean 60 to 100
points at the Nyquist sampling rate. Reducing this by a factor of four to
sixteen would result in sample sizes on the order of six to 25, clearly too
small.
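
The loss of sample size under decimation is easy to make concrete. The following sketch (hypothetical function name, NumPy assumed) selects every second, fourth, eighth and sixteenth point of a 512-point sample, which gives the sub-sample sizes used as column headings in Table 4.

```python
import numpy as np

def decimate_by_selection(x, step):
    """Keep every step-th point, discarding the rest (no filtering)."""
    return np.asarray(x)[::step]

x = np.arange(512)  # stands in for one 512-point sample
sizes = {step: len(decimate_by_selection(x, step)) for step in (2, 4, 8, 16)}
print(sizes)
```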

It  might  be  possible  to  develop  an  optimal  decimation  procedure  which 
would  retain  the  power  of  KS  for  uncorrelated  data.  Similarly,  spectral  or 
correlation  properties  other  than  t  might  yield  improved  correction 
techniques. These remain problems for further research. The current
solution is an easily implemented and computationally efficient method for
correcting the Kolmogorov-Smirnov one-sample test for goodness-of-fit.




REFERENCES

1.  Box, G.E.P., and Jenkins, G.M., Time Series Analysis: Forecasting and
    Control. San Francisco: Holden-Day, Inc.

2.  Gersch, W., "Spectral Analysis of EEG's by Autoregressive Decomposition
    of Time Series," Mathematical Biosciences.

3.  Jenkins, G.M., and Watts, D.G., Spectral Analysis and Its Applications.
    San Francisco: Holden-Day, Inc., 1968.

4.  Jones, R.H., Crowell, D.H., and Kapuniai, L.E., "A Method for Detecting
    Change in a Time Series Applied to Newborn EEG," Electroencephalography
    and Clinical Neurophysiology, 27 (1969), 436-440.

5.  Knuth, D.E., Seminumerical Algorithms: The Art of Computer Programming,
    2nd Edition, (1981), 2:45-56.

6.  Lilliefors, Hubert W., "On the Kolmogorov-Smirnov Test for Normality with
    Mean and Variance Unknown," American Statistical Association Journal,
    (June 1967), 399-402.

7.  McEwen, J.A., and Anderson, G.B., "Modeling the Stationarity and
    Gaussianity of Spontaneous Electroencephalographic Activity," IEEE
    Transactions on Biomedical Engineering, BME-22 (September 1975), 363-364.

8.  Persson, J., "Comments on Estimations and Tests of EEG Amplitude
    Distributions," Electroencephalography and Clinical Neurophysiology, 37
    (1974), 309-313.

9.  Pfurtscheller, G., and Haring, G., "The Use of an EEG Autoregressive Model
    for the Time-Saving Calculation of Spectral Power Density Distributions
    with a Digital Computer," Electroencephalography and Clinical
    Neurophysiology, 33 (1972), 113-115.

10. Rice, S.O., "Mathematical Analysis of Random Noise," in Noise and
    Stochastic Processes, N. Wax (ed.), Dover Pub., (1954), 133-294.

11. Saltzburg, B., "Parameter Selection and Optimization in Brain Wave
    Research," in Behavior and Brain Electrical Activity, Birch and Altshuler
    (eds.), Plenum Press (1975), 127-153.

12. Stephens, M.A., "EDF Statistics for Goodness of Fit and Some Comparisons,"
    Journal of the American Statistical Association, 69 (September 1974),
    730-737.

13. Weiss, M.S., "Non-Gaussian Properties of the EEG During Sleep,"
    Electroencephalography and Clinical Neurophysiology, 34 (1973), 200-202.

14. Weiss, M.S., "Modification of the Kolmogorov-Smirnov Statistic for Use
    with Correlated Data," Journal of the American Statistical Association,
    73 (1978), 872-875.

15. Weiss, M.S., "A New Test for EEG Gaussian Amplitude Distribution,"
    Proceedings IEEE/EMBS First Annual Conference, (1979), 309-311.

16. Weiss, M.S., A Portable, General Purpose Random Number Generator, NBDL
    Technical Report, (1984, in preparation).

17. Weiss, M.S., FORTRAN Subroutine Package for Generating Simulated
    Stochastic Time-series, NBDL Technical Report, (1984, in preparation).




TABLE 1

Parameters of Simulated EEG

EEG Type      a1       a2       a3       τ

   1         1.78     -.80     .062     1.03
             1.04     -.88     .55
             -.49     -.73     .32

   2         1.78     -.80     .088     1.15
             1.04     -.88     .56
             -.40     -.73     .32

   3         1.78     -.80     .087     1.46
             1.02     -.85     .35
             -.53     -.85     .18

   4         1.50     -.6      .25      1.78
             1.20     -.6      .53

   5         1.65     -.7      .15      2.10
             1.37     -.7      .39

   6         1.78     -.8      .09      2.45
              .99     -.8      .21

   7         1.78     -.8      .09      2.80
              .95     -.75     .19

   8         0.0      0.0      1.0      0.69

TABLE 1: This table lists the autoregressive coefficients and the
computed value for τ for the time-series generated by equation
(2.7). Each row within an EEG type is one spectral peak.




TABLE 2

Empirical significance levels for corrected and uncorrected KS

         N:        32            64            128           256           512
         α:    1.0    5.0    1.0    5.0    1.0    5.0    1.0    5.0    1.0    5.0

EEG  KS

 1   COR        .7    3.8    1.0    5.2    1.0    5.2    1.6    5.8    1.8    6.5
     UNC       1.4     ?     2.3    8.3    2.3    8.3    3.2    9.8    3.4   10.7

 2   COR        .9    3.9    1.1    5.4    1.1    5.4    1.4    6.7    1.8    7.0
     UNC       1.7    6.5    2.9    9.8    2.9    9.8    4.3   12.3    4.6   13.5

 3   COR        .9    3.8    1.0    4.2    1.0    4.2    1.3    5.1    1.6    5.3
     UNC       2.1    7.6    3.9   11.1    3.9   11.1    6.1   16.2    6.7   16.7

 4   COR       1.1    4.8     .9    3.9     .9    3.9     .7    3.5     .8    3.8
     UNC       4.3   12.6    4.8   13.5    4.8   13.5    5.9   16.0    6.4   16.6

 5   COR       1.4    5.5    1.1    4.6    1.1    4.6    1.1    3.8    1.1    4.0
     UNC       6.8   17.2    8.4   20.0    8.4   20.0   10.1   23.2   10.1   23.2

 6   COR        .9    3.4    1.0    3.8    1.0    3.8    2.3    6.5    2.4    6.8
     UNC       7.5   18.1   14.5   28.4   14.5   28.4   23.8   39.7   26.4   43.3

 7   COR        .8    3.5    1.1    3.7    1.1    3.7    2.5    6.0    2.1     ?
     UNC      10.4   22.5   20.3   35.3   20.3   35.3   31.4   48.3   33.4   50.0

 8   COR        .7    3.5     .6    3.2     .6    3.2     .7    3.3     .8    3.6
     UNC       1.14   4.9    1.0    4.7    1.0    4.7    1.2    5.0    1.33   5.3

TABLE 2: Critical values (%) computed for KSc (COR) and KSe (UNC) using
equations (3.1) and (2.19). Entries shown as ? are not legible in this copy.




TABLE 3

Power (%) of corrected KS statistic for non-normal time-series

                              DISTRIBUTION
                      A               B               C
           α:     1.0    5.0      1.0    5.0      1.0    5.0

EEG    N

 1     32         3.1    8.0      1.4    6.3      1.?    ?.8
       64         8.4   16.5      2.9    8.1      2.7    8.3
      128        20.9   31.3      5.6   12.8      4.4   1?.1
      256        39.?   52.0     10.0   18.8      7.7   18.1
      512        65.0   75.8     17.0   28.5     15.4   29.8

 2     32         2.9    8.2      1.6    6.0      1.5    5.8
       64         8.7   17.1      3.0    8.4      2.4    7.9
      128        20.3   30.3      6.0   13.2      4.0   12.5
      256        39.4   51.1      9.6   18.6      8.2   18.5
      512        63.5   74.3     16.1   27.3     16.0   29.9

 3     32         3.0    7.6      1.3    5.1      1.5    5.5
       64         8.0   15.5      2.6    7.5      2.3    6.9
      128        21.1   30.2      6.0   12.6      4.7   11.8
      256        37.4   48.3      9.7   18.0      9.8   19.7
      512        60.5   70.9     14.8   24.7     19.3   34.3

 4     32        10.5   19.8      4.4   10.9†     5.5   14.3
       64        20.7   31.4      6.8   13.7†     9.2   20.7
      128        35.2   47.3      9.9   17.7     18.9   35.5
      256        57.8   69.2     16.1   26.5     41.4   62.5
      512        84.2   91.2     25.7   39.3     77.5   90.4

 5     32         9.0   17.6      4.0   10.6      4.5   12.6
       64        14.6   24.0      5.3   12.0      6.2   15.0
      128        28.0   39.1      7.8   14.8     10.8   22.6
      256        47.5   58.9     11.8   20.2     22.3   40.5
      512        72.0   81.9     17.7   28.9     48.6   68.5

 6     32         3.1    8.4      2.0    5.7      1.7    5.7
       64         6.3   12.6      2.6    6.5      2.2    6.7
      128        16.9   25.4      6.5   12.2      5.8   12.5
      256        32.9   43.0     10.7   17.7     10.9   20.8
      512        54.1   64.2     14.9   23.4     21.5   36.4

 7     32         3.8    8.7      1.8    5.5      1.6    5.2
       64         5.4   11.1      2.4    6.3      2.1    6.2
      128        14.8   22.7      5.7   11.2      5.2   11.4
      256        29.7   38.4      9.3   15.7      9.7   17.7
      512        48.9   58.3     12.8   20.9     18.6   32.5

 8     32        43.9   58.4     18.3   32.3     92.7   98.4
       64        71.5   83.5     34.3   51.3    100.   100.
      128        94.5   98.0     62.8   78.9    100.   100.
      256        99.9  100.      91.0   96.9    100.   100.
      512       100.   100.      99.8  100.     100.   100.

† These entries were replicated once with the following results:
N=32, 4.4%, 10.9%; N=64, 5.6%, 12.3%

Digits shown as ? are not legible in this copy.




TABLE 4

Empirical significance levels (%) for corrected and
uncorrected KS using decimated data

         N:        256           128           64            32
         α:    1.0    5.0    1.0    5.0    1.0    5.0    1.0    5.0

EEG  KS

 1   COR       1.7    6.1     .7    3.7     .7    3.6     .7    3.8
     UNC       2.2    7.5    1.0    4.0    1.1    4.0    1.0    5.0

 2   COR       1.5    6.1     .8    3.8     .7    3.4     .7    3.7
     UNC       2.5    8.0    1.3    5.4    1.2    4.8    1.0    4.7

 3   COR       1.0    4.2     .5    2.7     .5    2.0     .8    3.8
     UNC       2.6    8.0    1.3    5.6    1.1    4.0    1.2    5.0

 4   COR        .4    2.7     ?     3.1     .7    3.3     .7    3.8
     UNC       1.5    6.2    1.0    4.6    1.1    4.0    1.2    4.9

 5   COR        .6    2.0     .6    2.8     .7    3.4     .7    3.8
     UNC       2.3    8.4    1.2    4.0    1.2    5.1    1.1    5.0

 6   COR       1.1    4.0     .4    2.2     .3    2.5     .7    3.2
     UNC       8.5   20.5    2.6    9.0    1.2    5.6    1.0    4.5

 7   COR       1.1    3.?     ?     3.6     .4    2.8     .6    3.2
     UNC      12.1   25.2    2.5   11.0    1.6    5.0    1.0    4.5

 8   COR        .9    4.0     .7    3.4     .6    3.3     ?     3.7
     UNC       1.6    5.0    1.2    4.8    1.0    4.8    1.?    5.0

TABLE 4: Values (%) computed for KSc (COR) and KSe (UNC) using normal
samples of size 512 with every second, fourth, eighth and sixteenth
point selected for computation. Entries and digits shown as ? are not
legible in this copy.




[Figure 1 panels: normalized spectral density; normalized autocorrelation]

FIGURE 1  SPECTRAL DENSITY AND AUTOCORRELATION FUNCTIONS FOR SIMULATED EEG'S.


FIGURE 2  AMPLITUDE HISTOGRAMS FOR EACH SIMULATED EEG - DISTRIBUTION COMBINATION. SAMPLE
SIZE IS 100,000 AND 100 INTERVALS USED FOR HISTOGRAM. A NORMAL PROBABILITY DENSITY
CURVE IS SUPERIMPOSED FOR COMPARISON.


